Paper: Generalization of Reinforcement Learners with Working and Episodic Memory

Neural Information Processing Systems

We thank the reviewers for their thoughtful and constructive feedback on our manuscript. Reviewer 3 noted that the Section 2 task descriptions could be better presented; we have reformatted them, which should help both contextualize each task's difficulty and illustrate what it involves. We also changed our description of IMPALA to match Reviewer 5's suggestion. Regarding the task suite, Reviewer 4 raised a thoughtful consideration on whether most of the findings would translate to 2D versions of the tasks. Some 3D tasks in the suite already have '2D-like' semi-counterparts that do not require navigation, '2D-like' because everything is fully observable and the agent has a first-person point of view from a fixed point. The Spot the Difference level was overall harder than Change Detection for our ablation models.


KARL: Kalman-Filter Assisted Reinforcement Learner for Dynamic Object Tracking and Grasping

Boyalakuntla, Kowndinya, Boularias, Abdeslam, Yu, Jingjin

arXiv.org Artificial Intelligence

We present Kalman-filter Assisted Reinforcement Learner (KARL) for dynamic object tracking and grasping with eye-on-hand (EoH) systems, significantly expanding such systems' capabilities in challenging, realistic environments. In comparison to the previous state of the art, KARL (1) incorporates a novel six-stage RL curriculum that doubles the system's motion range, thereby greatly enhancing its grasping performance, (2) integrates a robust Kalman filter layer between the perception and reinforcement learning (RL) control modules, enabling the system to maintain an uncertain but continuous 6D pose estimate even when the target object temporarily exits the camera's field of view or undergoes rapid, unpredictable motion, and (3) introduces mechanisms that allow retries to gracefully recover from unavoidable policy execution failures. Extensive evaluations in both simulation and real-world experiments qualitatively and quantitatively corroborate KARL's advantage over earlier systems, achieving higher grasp success rates and faster robot execution speed. Source code and supplementary materials for KARL will be made available at: https://github.com/arc-l/karl.

Humans, and animals in general, interact with the physical world by observing and handling everyday objects [1], which makes object tracking and manipulation arguably the most fundamental skills for physical intelligence. In robotics, autonomous grasping in stationary settings has been extensively studied [2], [3], typically using decoupled vision and manipulation sub-systems in which the camera does not move with the manipulator. While effective for static tasks, this approach struggles in dynamic scenarios where objects move or become occluded. Real-world interactions, such as handovers, require continuous tracking and adaptive grasping, highlighting the need for more integrated solutions.
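The Kalman filter layer described in point (2) can be sketched minimally. Below is an illustrative constant-velocity filter over 3D position only (the full system tracks 6D pose, i.e. orientation as well); the class name, noise parameters, and `step`/`predict`/`update` interface are assumptions for illustration, not KARL's actual code.

```python
import numpy as np

class PositionKalmanFilter:
    """Constant-velocity Kalman filter over 3D position.

    State x = [px, py, pz, vx, vy, vz]; measurements are noisy positions.
    When the object leaves the camera's view, we skip the update step and
    keep predicting, so the pose estimate stays continuous (with growing
    uncertainty), as the abstract describes.
    """

    def __init__(self, dt=0.05, process_var=1e-3, meas_var=1e-2):
        self.x = np.zeros(6)                      # state estimate
        self.P = np.eye(6)                        # state covariance
        self.F = np.eye(6)                        # constant-velocity transition
        self.F[:3, 3:] = dt * np.eye(3)
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position only
        self.Q = process_var * np.eye(6)          # process noise
        self.R = meas_var * np.eye(3)             # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, z):
        y = z - self.H @ self.x                   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)  # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P

    def step(self, measurement=None):
        """One filter tick; measurement is None when the object is not visible."""
        self.predict()
        if measurement is not None:
            self.update(np.asarray(measurement, dtype=float))
        return self.x[:3]
```

Calling `step(None)` while the object is out of view keeps the prediction running, so the estimate stays continuous while its covariance `P` grows, matching the "uncertain but continuous" behavior described above.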


Reviews: Generalization of Reinforcement Learners with Working and Episodic Memory

Neural Information Processing Systems

The authors do a good job of motivating their work, and they contribute a nice experimental section with good results. The ablation study was thorough. Well done! --- Many tasks that might be given to an RL agent are impossible without working memory. This paper presents a suite of tasks which require use of that memory in order to succeed. These tasks are compiled from a variety of other sources, either directly or re-implemented for this suite.



Generalization of Reinforcement Learners with Working and Episodic Memory

Neural Information Processing Systems

Memory is an important aspect of intelligence and plays a role in many deep reinforcement learning models. However, little progress has been made in understanding when specific memory systems help more than others and how well they generalize. The field also has yet to see a prevalent, consistent, and rigorous approach for evaluating agent performance on holdout data. In this paper, we aim to develop a comprehensive methodology to test different kinds of memory in an agent and to assess how well the agent can apply what it learns in training to a holdout set that differs from the training set along dimensions we suggest are relevant for evaluating memory-specific generalization. To that end, we first construct a diverse set of memory tasks that allow us to evaluate test-time generalization across multiple dimensions. Second, we develop an agent architecture that combines multiple memory systems, perform multiple ablations on it to obtain baseline models, and investigate its performance on the task suite.
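The evaluation protocol described above (train on one set of task variants, then measure performance on held-out variants that differ along chosen dimensions) can be written generically. The environment interface and function names below are illustrative assumptions, not the paper's code.

```python
def run_episode(agent, env):
    """Roll out one episode; `agent` is any callable mapping observation -> action.
    Assumed toy interface: env.reset() -> obs, env.step(a) -> (obs, reward, done)."""
    obs, done, total = env.reset(), False, 0.0
    while not done:
        obs, reward, done = env.step(agent(obs))
        total += reward
    return total

def generalization_gap(agent, train_tasks, holdout_tasks, episodes=20):
    """Mean return on training variants vs held-out variants.

    `train_tasks` / `holdout_tasks` are lists of zero-argument environment
    constructors; the holdout variants differ from training along some
    dimension (e.g. delay length, number of distractors)."""
    def mean_return(tasks):
        returns = [run_episode(agent, make_env())
                   for make_env in tasks for _ in range(episodes)]
        return sum(returns) / len(returns)

    train_score = mean_return(train_tasks)
    holdout_score = mean_return(holdout_tasks)
    return train_score, holdout_score, train_score - holdout_score
```

The returned gap (train minus holdout score) is one simple summary of memory-specific generalization along the chosen dimension.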


Implementation of a Model of the Cortex Basal Ganglia Loop

Arakawa, Naoya

arXiv.org Artificial Intelligence

This article presents a simple model of the cortex-basal ganglia-thalamus loop, which is thought to serve for action selection and executions, and reports the results of its implementation. The model is based on the hypothesis that the cerebral cortex predicts actions, while the basal ganglia use reinforcement learning to decide whether to perform the actions predicted by the cortex. The implementation is intended to be used as a component of models of the brain consisting of cortical regions or brain-inspired cognitive architectures.
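The hypothesized division of labor (the cortex proposes an action; the basal ganglia learn, by reinforcement, whether to execute it) can be illustrated with a toy Go/NoGo gate trained by a simple tabular update. This is a didactic sketch with assumed names and reward conventions, not the article's implementation.

```python
import random

class BasalGangliaGate:
    """Toy sketch of the cortex / basal-ganglia split described above: the
    "cortex" proposes an action for the current state, and this gate keeps
    learned Go/NoGo values deciding whether the proposal is executed.
    Values are updated from reward feedback, a crude stand-in for
    dopamine-driven reinforcement learning."""

    def __init__(self, lr=0.1, epsilon=0.1):
        self.values = {}      # (state, proposed_action) -> {"go": v, "nogo": v}
        self.lr = lr
        self.epsilon = epsilon

    def decide(self, state, proposal):
        v = self.values.setdefault((state, proposal), {"go": 0.0, "nogo": 0.0})
        if random.random() < self.epsilon:            # occasional exploration
            return random.choice(["go", "nogo"])
        return max(v, key=v.get)                      # greedy Go/NoGo choice

    def learn(self, state, proposal, choice, reward):
        v = self.values[(state, proposal)]
        v[choice] += self.lr * (reward - v[choice])   # tabular update toward reward
```

With rewarding "go" in one context and punishing it in another, the gate learns to execute the cortical proposal only where it pays off.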


EARL: Eye-on-Hand Reinforcement Learner for Dynamic Grasping with Active Pose Estimation

Huang, Baichuan, Yu, Jingjin, Jain, Siddarth

arXiv.org Artificial Intelligence

In this paper, we explore the dynamic grasping of moving objects through active pose tracking and reinforcement learning for hand-eye coordination systems. Most existing vision-based robotic grasping methods implicitly assume target objects are stationary or moving predictably. Grasping unpredictably moving objects presents a unique set of challenges. For example, a pre-computed robust grasp can become unreachable or unstable as the target object moves, and motion planning must also be adaptive. In this work, we present a new approach, Eye-on-hAnd Reinforcement Learner (EARL), for enabling coupled Eye-on-Hand (EoH) robotic manipulation systems to perform real-time active pose tracking and dynamic grasping of novel objects without explicit motion prediction. EARL addresses many thorny issues in automated hand-eye coordination, including fast tracking of 6D object pose from vision, learning a control policy for a robotic arm to track a moving object while keeping the object in the camera's field of view, and performing dynamic grasping. We demonstrate the effectiveness of our approach in extensive experiments on multiple commercial robotic arms, in both simulation and complex real-world tasks.
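One ingredient the abstract mentions, learning a policy that tracks a moving object while keeping it in the camera's field of view, is commonly encoded as a shaped reward term. A minimal illustrative version follows; the image resolution, penalty value, and function name are assumptions for illustration, not EARL's actual reward.

```python
import math

def tracking_reward(target_px, image_size=(640, 480), out_of_view_penalty=-1.0):
    """Shaped reward for keeping a tracked object in view.

    target_px: (u, v) pixel of the tracked object, or None when it is not
    detected. Reward is 1.0 with the object at the image center, decays to
    0.0 at a corner, and is a fixed penalty when the object leaves the frame.
    """
    if target_px is None:
        return out_of_view_penalty
    w, h = image_size
    u, v = target_px
    if not (0 <= u < w and 0 <= v < h):
        return out_of_view_penalty
    # Distance from the image center, normalized so a corner sits at 1.0.
    du = (u - w / 2) / (w / 2)
    dv = (v - h / 2) / (h / 2)
    return 1.0 - math.hypot(du, dv) / math.sqrt(2)
```

A term like this, summed with grasp-progress rewards, pushes the policy toward viewpoints that keep the perception module fed with observations of the target.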


Demanding and Designing Aligned Cognitive Architectures

Holtman, Koen

arXiv.org Artificial Intelligence

With AI systems becoming more powerful and pervasive, there is increasing debate about keeping their actions aligned with the broader goals and needs of humanity. This multi-disciplinary and multi-stakeholder debate must resolve many issues; here we examine three of them. The first issue is to clarify what demands stakeholders might usefully make on the designers of AI systems, useful in the sense that the technology to implement them already exists. We make this technical topic more accessible by using the framing of cognitive architectures. The second issue is to move beyond an analytical framing that treats useful intelligence as reward maximization only. To support this move, we define several AI cognitive architectures that combine reward maximization with other technical elements designed to improve alignment. The third issue is how stakeholders should calibrate their interactions with modern machine learning researchers. We consider how current fashions in machine learning create a narrative pull that participants in technical and policy discussions should be aware of, so that they can compensate for it. We identify several technically tractable but currently unfashionable options for improving AI alignment.


2022 Trends in Data Science: Newfound Ease and Accessibility - insideBIGDATA

#artificialintelligence

Despite the obvious impact of the most salient macro level trends impacting data science--including Artificial Intelligence, cloud computing, and the Internet of Things--the ends of this discipline remain largely unchanged from when it initially emerged nearly 10 years ago. The goal has always been to equip the enterprise with tailored solutions spanning technological approaches that not only justify, but also maximize the use of data for fulfilling the most meaningful business objectives at hand. Oftentimes, those involve the upper end of the analytics continuum in the form of predictive and prescriptive measures. Currently, cognitive computing deployments factor substantially into data scientists' abilities to complete this task. Ergo, the most profound developments affecting this space in 2022 reduce the traditional impediments to devising the underlying models that support applications of Natural Language Processing, cognitive search, image recognition, and other advanced analytics manifestations.